On Uncertainty Measures Used for Decision Tree Induction
Author
Abstract
This paper takes a further look at the uncertainty (information) criteria used in decision tree induction and, more generally, in learning conditional class probability models. We show the high degree of similarity between the two main families of criteria, based respectively on the logarithmic Shannon entropy function and the quadratic Gini index. We start by introducing a general family of entropy functions, then discuss these two particular cases, and end with a short review of the Kolmogorov-Smirnov distance, another related measure.
1 Generalized Information Functions
The concept of generalized information functions of type β was first introduced by Daróczy [1], and its use for pattern recognition problems was discussed by Devijver [2]. The entropy of type β (β positive and different from 1) of a discrete probability distribution (p_1, ..., p_m) is defined by H(p_1, ..., p_m) ≜ ...
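The similarity between the Shannon and Gini families can be checked numerically. The sketch below is illustrative only: the definition in the text above is cut off, so the code assumes the standard Daróczy form of the entropy of type β, H_β(p) = (Σ p_i^β − 1) / (2^(1−β) − 1), which may differ from the paper's own normalization. The function names are hypothetical and do not come from the paper.

```python
import math


def shannon_entropy(p):
    """Shannon entropy in bits of a discrete distribution p."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def gini_index(p):
    """Quadratic Gini index: 1 - sum of squared probabilities."""
    return 1.0 - sum(pi * pi for pi in p)


def entropy_type_beta(p, beta):
    """Entropy of type beta (assumed Daroczy form, beta > 0, beta != 1)."""
    if beta <= 0 or beta == 1:
        raise ValueError("beta must be positive and different from 1")
    power_sum = sum(pi ** beta for pi in p if pi > 0)
    return (power_sum - 1.0) / (2.0 ** (1.0 - beta) - 1.0)


if __name__ == "__main__":
    p = [0.5, 0.3, 0.2]
    # Under the assumed form, beta = 2 gives twice the Gini index.
    print(entropy_type_beta(p, 2.0), 2.0 * gini_index(p))
    # As beta approaches 1, the value approaches the Shannon entropy in bits.
    print(entropy_type_beta(p, 1.0 + 1e-6), shannon_entropy(p))
```

Under this assumed normalization, every member of the family equals 1 on the uniform two-class distribution (0.5, 0.5), so the Shannon and Gini criteria differ only in the shape of the curve between the pure and the uniform cases.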
Similar Resources
MMDT: Multi-Objective Memetic Rule Learning from Decision Tree
In this article, a Multi-Objective Memetic Algorithm (MA) for rule learning is proposed. Prediction accuracy and interpretability are two measures that conflict with each other. In this approach, we consider both the accuracy and the interpretability of rule sets. Additionally, individual classifiers face other problems, such as huge sizes, high dimensionality and imbalanced class distributions in data sets. This...
Comparing different stopping criteria for fuzzy decision tree induction through IDFID3
Fuzzy Decision Tree (FDT) classifiers combine decision trees with the approximate reasoning offered by fuzzy representation to deal with language and measurement uncertainties. When an FDT induction algorithm uses stopping criteria to halt the tree's growth early, the threshold values of those criteria control the number of nodes. Finding a proper threshold value for a stopping crite...
Uncertainty Measurement for Ultrasonic Sensor Fusion Using Generalized Aggregated Uncertainty Measure
In this paper, target differentiation based on patterns in data obtained from a pair of ultrasonic sensors is considered. A neural-network-based target classifier is applied to these data to categorize the data of each sensor. The results are then fused using Dempster–Shafer theory (DST) and Dezert–Smarandache theory (DSmT) to make the final decision. The Generalized Aggregated Unce...
DIAGNOSIS OF BREAST LESIONS USING THE LOCAL CHAN-VESE MODEL, HIERARCHICAL FUZZY PARTITIONING AND FUZZY DECISION TREE INDUCTION
Breast cancer is one of the leading causes of death among women. Mammography remains the best technology for detecting breast cancer early and efficiently, and for distinguishing between benign and malignant diseases. Several techniques in image processing and analysis have been developed to address this problem. In this paper, we propose a new solution to the problem of computer aided detection and...
Learning ELM-Tree from big data based on uncertainty reduction
A challenge in big data classification is the design of highly parallelized learning algorithms. One solution to this problem is applying parallel computation to different components of a learning model. In this paper, we first propose an extreme learning machine tree (ELM-Tree) model based on the heuristics of uncertainty reduction. In the ELM-Tree model, information entropy and ambiguity are ...